Genome-wide association analysis to delineate high-quality SNPs for seed micronutrient density in chickpea (Cicer arietinum L.)

Chickpea is the most important nutrient-rich grain legume crop in the world. A diverse core set of 147 chickpea genotypes was genotyped with a Axiom(®)50K CicerSNP array and trait phenotyped in two different environments for four seed micronutrients (Zn, Cu, Fe and Mn). The trait data and high-throughput 50K SNP genotypic data were used for the genome-wide association study (GWAS). The study led to the discovery of genes/QTLs for seed Zn, Cu, Fe and Mn, concentrations in chickpea. The analysis of seed micronutrient data revealed significant differences for all four micronutrient concentrations (P ≤ 0.05). The mean concentrations of seed Zn, Cu, Fe and Mn pooled over the 2 years were 45.9 ppm, 63.8 ppm 146.1 ppm, and 27.0 ppm, respectively. The analysis of results led to the identification of 35 SNPs significantly associated with seed Zn, Cu, Fe and Mn concentrations. Among these 35 marker-trait associations (MTAs), 5 were stable (consistently identified in different environments), 6 were major (explaining more than 15% of the phenotypic variation for an individual trait) and 3 were both major and stable MTAs. A set of 6 MTAs, MTAs (3 for Mn, 2 for Fe, and 1 for Cu) reported by us during the present study have been also reported in the same/almost same genomic regions in earlier studies and therefore declared as validated MTAs. The stable, major and validated MTAs identified during the present study will prove useful in future chickpea molecular breeding programs aimed at enhancing the seed nutrient density of chickpea.

. Similarly, analysis of seed micronutrient concentrations in Year II revealed that the Zn concentration ranged from 9.36 to 205.86 ppm (mean 46.58 ppm), Cu concentration ranged from 3.66 to 190.36 ppm (mean 66.81 ppm), Fe concentration ranged from 100.36 to 127.2 ppm (mean 132.29 ppm) and Mn concentration ranged from 4.86 to 122.86 ppm (mean 28.68 ppm) ( Table 1). The mean values of Zn, Cu, Fe and Mn concentrations in Environment III (pooled data) are presented in Fig. 1. Levene's test conducted for the trait data of two different environments of seed Zn, Cu, Fe and Mn concentrations revealed non-significant differences in variances (P for Zn = 0.48; P for Cu = 0.99; P for Fe = 0.99 and P value for Mn = 0.98) of the trait data of two environments/experiments. Thus, the results indicated that the variances are homogeneous between the two experiments. Combined ANOVA indicated significant interactions (P ≤ 0.5%) between genotype and environment (year) for all four micronutrients (Zn, Cu, Fe and Mn). The mean micronutrient concentrations of the genotypes varied significantly. The maximum coefficient of variation was observed for seed Mn (2.68%), followed by Zn (1.58%), Cu (1.40%) and Fe (0.52%) ( Table 2).
Single nucleotide polymorphism (SNP) distribution on the chickpea genome. We genotyped the 147 chickpea genotypes with an Axiom ® CicerSNP Array comprising 50,590 SNPs, and SNPs were called following the Axiom best practice workflow 29 . We excluded SNPs with ambiguous bases as well as all loci with more than 10% missing data. Furthermore, we also excluded monomorphic SNPs and SNPs with a PIC value < 0.349%.

Diversity analysis.
We identified two subpopulations in the chickpea core set based on principal component analysis (PCA) (Fig. 3). Analysis of molecular variance (AMOVA) among the two subpopulations identified based on PCA indicated that the major source of variance was the genotypes, not the populations. The  www.nature.com/scientificreports/ variation of 7% was due to populations (between populations), and 93% variation was within the population, i.e., due to genotype alone (Supplementary Table S1). The maximum number of private alleles (398) was present in the indigenous population, exhibiting that the indigenous genotypes are diverse in comparison to the exotic ones (Supplementary Table S2). A diversity analysis study was conducted to estimate important diversity parameters of the core set. The summary of the results of diversity statistics presented in Table 4 shows that population-1 is more diverse than population-2, with higher values for diversity parameters such as gene diversity "He" (major index of diversity).
Stable/promising and major MTAs. The MTAs identified in more than one environment were grouped as stable/promising MTAs, while MTAs with greater than 15% phenotypic variation were grouped as major MTAs. Out of the 35 identified associated SNPs, 5 SNPs were stable/promising, 6 MTAs were found to be major MTAs, and 3 MTAs were both stable and major. Among the stable/promising MTAs, 4 MTAs were for Cu (Affx_123261919, Affx_123255840, Affx_123294330 and Affx_123280871), explaining 15.6%, 17.3%, 14.05% and 18.25% of the average phenotypic variation with a P value range of 4.8 × 10 -5 to 9.7 × 10 -4 , respectively, and one MTA (located on LG 4) for Zn (Affx_123261732, P-value = 4.5 × 10 -4 -8.0 × 10 -4 ) was identified with 10.1% of the average phenotypic variation, while no MTAs were found for Fe and Mn. A set of 6 major MTAs (explaining more than 15% of phenotypic variation) were identified, of which 5 MTAs were for Cu (Affx_123259344, Affx_123261919, Affx_123255840, Affx_123280871 and Affx_123280871). The remaining major MTA (Affx_123275255) was identified for Fe, with an average phenotypic variation of 19.9%. Out of 5 major MTAs identified for seed Cu concentration, 3 MTAs (Affx_123261919, Affx_123255840 and Affx_123280871) were both stable and major. Affx_123280871 explained the maximum phenotypic variation (18.25%). The list of these markers and their chromosomal location and association with the trait of interest is presented in Table 6.

Discussion
One of the major global health issues that need immediate attention is micronutrient malnutrition 38 . Iron and zinc are indispensable micronutrients in the human diet. By acting as cofactors for proteins, including hemoglobin, transcription factors and cytochrome, they play pivotal roles in growth and development. Their deficiency leads to physiological disorders such as tissue hypoxia, anemia, acrimal dermatitis, and hypogonadism. Copper plays an important role in metabolism in both plants and animals 39 . Its deficiency in the human diet causes several abnormalities in important life processes, such as collagen and elastin biosynthesis, iron mobilization, and wound healing 40 . To overcome or mitigate this problem of micronutrient malnutrition, strategies such as mineral supplementation, dietary diversification and biofortification need to be implemented 41 . Chickpea is the most important nutrient-rich grain legume crop, and incorporation of this crop in human diets is considered a very good strategy for diet diversification and alleviating micronutrient malnutrition 10-13 . www.nature.com/scientificreports/ Recent advances in genomics and next-generation sequencing have enabled the availability of ample genomics resources, making crop genomics resources rich. Conventional QTL mapping approaches to identify genomic regions have been restricted to biparental mapping populations at less resolution 42 . The availability of reliable markers will help in tailoring climate-smart and nutrient-rich varieties that cater to both the demand and health aspects of humans through genomics-assisted breeding programs. The availability of genetic and genomic resources will pave the way for the discovery of useful/stable/major genes/QTLs/markers for seed micronutrients in chickpea through a variety of approaches. A variety of molecular markers have been developed in almost all crop plants and these markers have been used for variety of purposes including the study of genetic diversity, study of population structure and also in gene discovery programs [43][44][45][46][47][48][49][50][51] ). However, despite the abundance of different types of markers available, SNPs have witnessed exponential utilization for a variety of diverse applications since their discovery 52 .   www.nature.com/scientificreports/ Our trait variability results indicated that the genotype × environment interactions were significant, which indicates that the significant proportion of seed micronutrient concentrations in chickpea is dependent on the soil conditions, temperature, precipitation, and various crop management practices and that significant variation also exists between the genotypes. A similar effect of location and year on seed micronutrient concentrations has been reported in rice 53 , field pea, chickpea, common bean and lentil 12 . In the present study, the concentrations of different micronutrients obtained were similar to the results observed in earlier studies 5,[9][10][11]30,54,55 . The candidate lines identified for different micronutrients should be used in chickpea breeding programs and would also serve as useful genetic resources for gene discovery and transcriptomic studies. Diversity analysis was conducted to identify molecular genetic variations present in chickpea germplasm, revealing that indigenous chickpea genotypes are more diverse in comparison to exotic ones, which may be due to the genetic bottleneck in exotic lines or a lower number of lines included in the present study. The availability of lines carrying contrasting genetic variations with different genetic parameters will benefit breeders in chickpea breeding programs. The PCA clustered the germplasm set into two groups, indicating the existence of two subpopulations for the set of genotypes used. Similar results have been reported in some other earlier studies in chickpea making use of SSR markers 56,57 .
We report significant MTAs associated with key micronutrients through SNP data based on Axiom ® CicerSNP Array and 2 years of phenotyping of core collections of chickpea. To optimize the results, we utilized 7% high-quality SNPs (PIC > 0.349%) after deep filtration to precisely map and identified 35 highly associated SNPs for four seed micronutrients (9 each with Fe and Mn, 5 for Zn, and 12 for Cu) in chickpea with an average of ~ 9 SNPs/trait. Quantitative trait loci for seed micronutrients have also been identified in different earlier studies in chickpea (2 QTLs for Cu 54 and 24 QTLs for Zn and Fe 30,55 ). In addition, several important genes/QTLs for different micronutrients (Zn, Fe, Cu, Mg, P, K and Mn) have also been reported in some other earlier studies in chickpea [5][6][7] . For seed Cu concentration, we identified significant MTAs on LG1, 2, 3, 4, 5, 6 and 8, while as MTAs were identified on LG5 and LG7 only in an earlier study 5 . For seed Zn concentrations, we identified significant MTAs on LG1, 4 and 7, while as in an earlier study, MTAs have been reported on LG4 and LG5 7 . For seed Fe concentration, we identified MTAs on LG1, 3, 4, 5, 6, 7 and 8, while as MTAs have also been reported for seed Fe concentrations on LG1,4 and LG5 in an earlier study 7 . For Mn concentration, we identified significant MTAs on LG1, 2, 3, 4 and 7, while MTAs have been reported on LG1, 2, 3, 4, 6 and 7 in an earlier study 6 . The comparison of the results revealed that several unique and new MTAs on different chromosomes not reported on these chromosomes in earlier studies have been reported during the present study (Table 5). On the other hand, a set of six MTAs (2 for Fe, 3 for Mn and 1 for Cu) reported by us during the present study have been also reported in the same/almost same genomic regions in earlier studies [5][6][7] (Table 5). Therefore, these MTAs have been declared as validated MTAs. All 35 SNPs identified during the present study were detected on all eight chickpea linkage groups/chromosomes (LG1-LG8). Genomic regions identified in the present study have also been flanked by QTLs responsible for seed micronutrients in chickpea; for example, QTLs for seed Fe were present on Ca1, Ca3, Ca4, Ca5 and Ca7 10,30,54 . This could be due to the different genetic backgrounds of the germplasm used in our study. The present study also identified QTLs at different chromosomal positions not reported earlier, suggesting that we have identified novel QTLs linked to seed micronutrients in chickpea. QTLs for all four micronutrients have also been mapped in different crops, including lentil for Zn (on chromosome 1, 2, 3, and 4), and Fe (on chromosome 3, 5 and 6) 58 ; pea for Zn (on chromosome 2, 3, 5 and 7), Fe (on chromosome 2, 3, 5, 6 and 7), and Mn (on chromosome 1, 2, 4, 5, 6 and 7) 59 ; and Medicago truncatula for Zn (on chromosomes 1, 2, 3, 4, 5, and 7), Fe (on chromosomes 7 and 8), Cu (on chromosome 1, 2, 4 and 7), and Mn (on chromosomes 1, 2, 3, 5, 6 and 7) 60 . The markers found to have a significant association with the trait should be used in marker-assisted selection for biofortified chickpea to mitigate the problem of micronutrient malnutrition or hidden hunger.

Conclusions
Our study is the first of its kind that presents the features of a large population of chickpea genotypes in different environments grown in the region of Kashmir, providing an understanding of the influence of environmental and growing conditions on the seed micronutrient concentrations of a crop. This study provides insight into the genetic basis of variability for seed micronutrient concentrations in chickpea and the potential of GWAS in unravelling marker trait associations in economically important crops. Because of their importance, stable MTAs should be recommended for chickpea molecular breeding programs aimed at enhancing the seed micronutrient concentrations of local chickpea landraces preferred by consumers. The stable and major MTAs identified during the present study will prove useful in breeding programs aimed at enhancing seed micronutrient concentrations in chickpea.

Methods
The plant material used, trait data recorded, genotyping done and data analysis conducted in this study has been presented under separate headings below. All the methods in this manuscript were carried out in accordance with relevant guidelines and regulations.
Plant material. We prepared a chickpea core set comprising 147 genotypes (128 Desi, 12 Kabuli and 7 intermediate pea shaped) based on SSR marker data generated on 384 composite sets 56 (Supplementary Table S3 www.nature.com/scientificreports/ important to mention that "The material was not collected by us but was procured from the national/international institutions after signing a proper Material Transfer Agreement (MTA) with these institutions".
Trait phenotyping. Trait phenotyping to quantify the seed micronutrient concentration for four micronutrients Zn, Fe, Cu, and Mn was performed using the core set, grown in two replicated field trials arranged in an augmented block design with a planting density of 10 seeds per row per genotype in rows of 4 m long with a row-to-row spacing of 30 cm. The trial was conducted for two consecutive years, 2017 and 2018, in the research fields of the Faculty of Agriculture (FoA), SKUAST-K, Wadura Campus, Sopore, Kashmir, India. The concentrations of four micronutrients, Zn, Fe, Cu, and Mn, were determined following the procedure described by Ref. 61 . Healthy mature harvested seeds from each genotype, free from any impurity, were surface cleaned with doubledistilled water and dried in an oven at 55-60 °C for 48 h. The dry seeds were ground into powder using a ball mill (MM400). For each genotype, one gram of powder was weighed in an Erlenmeyer flask, and acid digestion was conducted on a hot plate (250 °C) by adding 10 ml of HNO 3 -HClO 4 (9:4) solution to each flask 61 . The digested mixture was filtered through Whatman No. 40 filter paper, and using double distilled water, the total volume was made up to 50 ml. The chemicals used in the digestion of the samples were of analytical grade. The samples were analyzed against standards of known concentration. The concentrations of nutrients were determined using an atomic absorption spectrophotometer (AAS) (Agilent Technologies, 200 series). The concentrations of micronutrients present were estimated and expressed as mg/kg (ppm) of seed. Three independent replications were noted for each accession, and their mean was used for the statistical analysis.
Genome-wide marker genotyping/whole-genome scanning. DNA extraction and purification. The genomic DNA of 147 chickpea genotypes was isolated and purified using a high-throughput DNeasy Plant Mini Kit by Qiagen (www. qiagen. com/ handb ooks) following the manufacturer's instructions. For DNA isolation, leaf samples from three random healthy plants for each genotype were taken. After purification, the DNA was dissolved in 100 μl Tris-EDTA (TE) buffer. A NanoDrop (ND 1000) was used for the quantification of the isolated DNA samples, and the purity of the samples was analyzed by agarose gel electrophoresis using 0.8% agarose gel in the presence of Tris-acetate-EDTA (1× TAE) buffer.
SNP genotyping using 50K SNP Axiom Cicer SNP Array. The purified high-quality DNA (100 ng µl −1 ) from a set of 147 chickpea genotypes was further used for genotyping using 50K SNP array 37 at the Centre of Excellence in Genomics and Systems Biology (CEGSB) Lab Facilities, ICRISAT, Patancheru, Hyderabad. The total genomic DNA before hybridization was fragmented (25-125 bp) followed by purification and hybridized with the 50K SNP Axiom CicerSNP Array to generate the genotyping data. The nonspecific random ligations bound to the targets were washed off under stringent conditions. The Gene Titan-Multi-Channel instrument was used to stain, image and process the array to track to the multicolor ligation events at the array surface, which points toward the polymorphic nucleotides and ultimately generate the data.

Data analysis. Analysis of variance.
Based on the mixed-effect model for three different replications and 2 years using the statistical functions package of R software according to 5% significance levels, analysis of variance (ANOVA) was conducted to analyze differences in the seed nutrient concentrations. The diverse statistical measures, including minimum, maximum, average value, standard deviation and coefficient of variation (CV), among the diverse accessions were measured. The effect of genotype, environment (year) and their combined effect (G × E) was also analysed. Levene's test was also conducted to test the homogeneity of variance of genotypes in two different environments/experiments using an established procedure in Microsoft Excel (https:// www. youtu be. com/ watch?v= HoYos 9IwZNY). The test is used to test whether the trait data collected in two different experiments/environments are significantly different. A P value < 0.05 indicates significant differences between the variances (heterogeneous variances), while P > 0.05 indicates nonsignificant differences between the trait values of two experiments (homogenous variances).
Diversity analysis and population differentiation. Diversity analysis was performed using genotypic data generated by a set of 7277 SNPs on 147 chickpea genotypes. The software program GenAlEx version 6.5 62 was utilized to statistically analyze the genotypic data, and parameters such as the total number of alleles (Na), the number of effective alleles (Ne) and Shannon's information index were calculated. To conduct principal coordinate analysis (PCA), the genotypic data (multiple loci and multiple genotypes) were also analyzed by GenAlEx version 6.5 software. For the construction of the PCA plots, a pairwise genetic distance matrix was calculated. The same software program GenAlEx version 6.5 was used to compute the other genetic diversity parameters, such as genetic identity, gene flow (Nm), and pairwise Nei's unbiased genetic distance. Using the software package Power-Marker version 3.25 63 , we calculated the polymorphic information content (PIC) for each SNP marker locus.
Analysis of molecular variance (AMOVA) and genetic diversity indices. To calculate the level of genetic variation not only among the genotypes but also within the populations, analysis of molecular variance (AMOVA) was performed. The genetic distance matrix (same as used for PCA) was utilized for conducting AMOVA. The software program GenAlEx6.5 was utilized to perform the AMOVA.
Genome-wide association analyses. To identify markers associated with seed Zn, Fe, Cu, and Mn concentrations, agenome-wide association study (GWAS) was conducted. GAPIT (The Genome Association and Predic- www.nature.com/scientificreports/ tion Integrated Tool) version 3.0, R package (http:// zzlab. net/ GAPIT) was used for analyzing the trait phenotyping data (concentration of micronutrients analyzed through AAS) and genotyping data (obtained through the use of Axiom Cicer SNP Array). The model used in GWAS was MLM (mixed linear model), as it has proven quite useful for controlling the relatedness and population structure within GWAS. The population structure in MLM is considered a fixed effect, whereas kinship among individuals is incorporated as the variance-covariance structure of the random effect for the individuals 64 . SNPs showing minor allele frequencies below 0.05 were eliminated from the dataset. In total, 41,489 polymorphic markers were retained for the 147 genotyped accessions. To show the distribution of the P-values for the SNP markers, Manhattan plots were generated (Figs. 4, 5, 6, 7). The significant association between SNP and trait of interest was declared at a threshold value of P (≥ 10 -3 ).